Learning Continuous Phrase Representations for Translation Modeling
Authors
Abstract
This paper tackles the sparsity problem in estimating phrase translation probabilities by learning continuous phrase representations, whose distributed nature enables the sharing of related phrases in their representations. A pair of source and target phrases are projected into continuous-valued vector representations in a low-dimensional latent space, where their translation score is computed by the distance between the pair in this new space. The projection is performed by a neural network whose weights are learned on parallel training data. Experimental evaluation has been performed on two WMT translation tasks. Our best result improves the performance of a state-of-the-art phrase-based statistical machine translation system trained on WMT 2012 French-English data by up to 1.3 BLEU points.
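As a rough illustration of the idea described above, the sketch below projects a source phrase and a target phrase (represented here as bag-of-words count vectors) through a small one-hidden-layer network into a shared low-dimensional space and scores the pair by the dot product of the two projections. The bag-of-words input, the tanh layer, the dimensionality, and the dot-product score are assumptions made for this example, not the paper's exact architecture or training procedure.

```python
# Minimal sketch of continuous phrase representations for translation scoring.
# Not the paper's exact model: inputs, layer sizes, and the scoring function
# are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def bag_of_words(phrase, vocab):
    """Represent a phrase as a vocabulary-sized word-count vector."""
    v = np.zeros(len(vocab))
    for w in phrase.split():
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

class PhraseProjector:
    """One-hidden-layer projection of a phrase vector into a d-dimensional latent space."""
    def __init__(self, vocab_size, dim=64):
        self.W = rng.normal(scale=0.1, size=(dim, vocab_size))
        self.b = np.zeros(dim)

    def __call__(self, x):
        return np.tanh(self.W @ x + self.b)

def translation_score(src_vec, tgt_vec):
    """Score a phrase pair by the similarity of their latent representations."""
    return float(src_vec @ tgt_vec)

# Toy vocabularies and phrase pair, for illustration only.
src_vocab = {w: i for i, w in enumerate("le chat noir dort".split())}
tgt_vocab = {w: i for i, w in enumerate("the black cat sleeps".split())}

src_net = PhraseProjector(len(src_vocab))
tgt_net = PhraseProjector(len(tgt_vocab))

f = src_net(bag_of_words("chat noir", src_vocab))
e = tgt_net(bag_of_words("black cat", tgt_vocab))
print("score(chat noir, black cat) =", translation_score(f, e))
```

In a phrase-based system, a score of this kind is typically added as one more feature in the decoder's log-linear model, alongside the conventional relative-frequency translation probabilities.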
Similar papers
Learning continuous-valued word representations for phrase break prediction
Phrase break prediction is the first step in modeling prosody for text-to-speech (TTS) systems. Traditional methods of phrase break prediction have used discrete linguistic representations (such as POS tags, induced POS tags, and word-terminal syllables) for modeling these breaks. However, these discrete representations suffer from a number of issues, such as fixing the number of discrete classes and al...
Learning Semantic Representations for the Phrase Translation Model
This paper presents a novel semantic-based phrase translation model. A pair of source and target phrases are projected into continuous-valued vector representations in a low-dimensional latent semantic space, where their translation score is computed by the distance between the pair in this new space. The projection is performed by a multi-layer neural network whose weights are learned on parall...
Learning Phrase Embeddings from Paraphrases with GRUs
Learning phrase representations has been widely explored in many Natural Language Processing (NLP) tasks (e.g., Sentiment Analysis, Machine Translation) and has shown promising improvements. Previous studies either learn noncompositional phrase representations with general word embedding learning techniques or learn compositional phrase representations based on syntactic structures, which eithe...
Learning Bilingual Distributed Phrase Representations for Statistical Machine Translation
Following the idea of using distributed semantic representations to facilitate the computation of semantic similarity between translation equivalents, we propose a novel framework to learn bilingual distributed phrase representations for machine translation. We first induce vector representations for words in the source and target language respectively, in their own semantic space. These word v...
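The snippet above is cut off, but the general idea of relating separately induced monolingual representations can be sketched as follows. This is an illustration only, not the authors' framework: phrase vectors are composed here by averaging word vectors, the two spaces are linked with a least-squares linear map estimated from a few seed phrase pairs, and translation similarity is then a cosine in the target space. The embeddings and phrase pairs are hypothetical toy data.

```python
# Illustrative sketch of bilingual distributed phrase representations.
# Assumptions (not the authors' method): averaging composition, a linear map
# between the two spaces, and cosine similarity for scoring.
import numpy as np

rng = np.random.default_rng(1)
DIM = 50

# Hypothetical monolingual word embeddings (in practice, induced from corpora).
src_emb = {w: rng.normal(size=DIM) for w in ["chat", "noir", "dort"]}
tgt_emb = {w: rng.normal(size=DIM) for w in ["cat", "black", "sleeps"]}

def phrase_vector(phrase, emb):
    """Compose a phrase vector by averaging its word vectors."""
    vecs = [emb[w] for w in phrase.split() if w in emb]
    return np.mean(vecs, axis=0)

def fit_mapping(pairs):
    """Least-squares linear map from the source phrase space to the target phrase space."""
    X = np.stack([phrase_vector(s, src_emb) for s, _ in pairs])
    Y = np.stack([phrase_vector(t, tgt_emb) for _, t in pairs])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

seed_pairs = [("chat noir", "black cat"), ("chat dort", "cat sleeps")]
W = fit_mapping(seed_pairs)

mapped = phrase_vector("chat noir", src_emb) @ W
print("similarity(chat noir, black cat) =",
      cosine(mapped, phrase_vector("black cat", tgt_emb)))
```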
Learning Semantic Representations for Nonterminals in Hierarchical Phrase-Based Translation
In hierarchical phrase-based translation, coarse-grained nonterminal Xs may generate inappropriate translations due to the lack of sufficient information for phrasal substitution. In this paper we propose a framework to refine nonterminals in hierarchical translation rules with real-valued semantic representations. The semantic representations are learned via a weighted mean value and a minimum...